1.  INTRODUCTION


The  von  Mises  distribution,  in  2-dimensions,  and  the  Fisher 
distribution,  in  3-dimensions,  have  been  extensively  used  in  recent 
years  to  describe  directional  data.  In  this  paper  we  give  the  theory 
of  the  von  Mises  distribution  in  p-dimensions,  and  suggest  some 
possible  applications.  Some  results  along  these  lines  were  given 
by  Watson  (1956)  and  by  Watson  and  Williams  (1956) ,  but  the  first 
complete  extension  to  p-dimensions  was  given  in  the  author's  Ph.D.  thesis 
(Stephens,  1962a,  Chapter  9).  It  was  later  issued  as  a  technical  report 
(Stephens,  1962b)  and  some  results  have  since  been  reproduced  by 
Mardia (1975) and by Degerine (1977); since these sources are
relatively inaccessible, we begin with a summary of results, taken
essentially from Stephens (1962a,b). In later sections, we develop
some new techniques for the analysis of data, and illustrate with
an application to data recorded as a set of continuous
proportions.

2.  THE  VON  MISES  DISTRIBUTION  IN  p-DIMENSIONS 

2.1.  The  von  Mises  Distribution. 

A typical sample item is recorded as a unit vector from
the center $O$ to a point $P$ on the surface of a hypersphere, of
unit radius, in p-dimensions. A typical sample then consists of the
points $P_i$, $i = 1,\ldots,N$, or equivalently the vectors $OP_i$. When
$p = 2$, the points are on a circle, and when $p = 3$, they are on
a sphere. The vectors can then denote directions, e.g. of
prevailing winds, flights of birds, or magnetization of rocks.
Until now, it is in these contexts that the distribution has been
extensively used. Let the unit vector $OP$, called $v$, have
coordinates $x_1, x_2, \ldots, x_p$ in a suitable rectangular system.
It will also be useful to use polar coordinates, consisting, in
general, of the radius $r$ (here $r = 1$), and angular coordinates
$\theta_1, \theta_2, \ldots, \theta_{p-1}$. The relations between the two sets of coordinates
are

\[ x_1 = \cos\theta_1 , \]

\[ x_j = \cos\theta_j \prod_{i=1}^{j-1} \sin\theta_i , \quad j = 2,\ldots,p-1 , \]

\[ x_p = \prod_{i=1}^{p-1} \sin\theta_i . \]

The von Mises density is symmetrical around the modal
vector $OA$; for convenience in analyzing the distribution, we place
this vector along $\theta_1 = 0$. The density per unit area on the
hypersphere is then proportional to $\exp(k\cos\theta_1)$, where $k$ is a
concentration parameter. The joint density function of the $\theta_i$ is

\[ f(\theta_1,\theta_2,\ldots,\theta_{p-1}) = C_p(k)\exp(k\cos\theta_1)\,\sin^{p-2}\theta_1\,\sin^{p-3}\theta_2 \cdots \sin\theta_{p-2} \]

over the range of the $\theta_i$.

The constant term is

\[ C_p(k) = \frac{k^{q}}{(2\pi)^{q+1} I_q(k)} , \]

where $q$ is written for $p/2-1$ and where $I_m(k)$ is the imaginary
Bessel function (the modified Bessel function of the first kind) of
order $m$ and argument $k$. When $k$ is 0,
$C_p(0)$ becomes $\Gamma(p/2)/(2\pi^{p/2})$ and the density is uniform over the
unit hypersphere; the concentration around $OA$ increases with $k$.
With the density as described, $x_1$ is the component of $OP$ on the
modal vector and $\theta_1$ is the angle between $OP$ and the modal vector.
An orthogonal transformation allows the density to be transformed
to place $OA$ along any chosen vector, but the general form is then
very complicated.
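
The relations between polar and rectangular coordinates can be illustrated with a short Python sketch. The function name `polar_to_cartesian` and the example angles are illustrative assumptions, not part of the original report; the code only applies the product-of-sines relations stated above.

```python
import math

def polar_to_cartesian(thetas):
    """Map angles (theta_1, ..., theta_{p-1}) to a unit vector (x_1, ..., x_p)."""
    x = []
    sin_prod = 1.0  # running product sin(theta_1) * ... * sin(theta_{j-1})
    for theta in thetas:
        x.append(math.cos(theta) * sin_prod)
        sin_prod *= math.sin(theta)
    x.append(sin_prod)  # x_p is the product of all the sines
    return x

# Hypothetical angles giving a point on the unit hypersphere in p = 4 dimensions.
v = polar_to_cartesian([0.3, 1.1, 2.0])
print(v, sum(c * c for c in v))  # the squared length equals 1 up to rounding
```

With $\theta_1 = 0$ the function returns the modal direction $(1, 0, \ldots, 0)$, in agreement with the convention of placing $OA$ along $\theta_1 = 0$.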


2.2.  Notation  for  vectors  and  related  statistics. 

In a suitable rectangular system, let vector $w_i$ have
components $x_{i1}, x_{i2}, \ldots, x_{ip}$, written $w_i = (x_{i1}, x_{i2}, \ldots, x_{ip})$. Two
calculations involving vectors are often needed: the scalar product
of two vectors and the length of a vector. The scalar product
$s(w_i,w_j)$ of vectors $w_i$ and $w_j$ is defined as

\[ s(w_i,w_j) = x_{i1}x_{j1} + x_{i2}x_{j2} + \cdots + x_{ip}x_{jp} \tag{2} \]

and the length $l_i$ of $w_i$ is given by
$l_i^2 = s(w_i,w_i) = x_{i1}^2 + x_{i2}^2 + \cdots + x_{ip}^2$.
A vector $w_i$ is reduced to unit length by
dividing by its length; its components are then $(x_{i1}/l_i, x_{i2}/l_i, \ldots, x_{ip}/l_i)$.
The scalar product is also $s(w_i,w_j) = l_i l_j \cos\phi_{ij}$, where $\phi_{ij}$ is
the smaller angle between vectors $w_i$ and $w_j$. The values of $s(w_i,w_j)$
and of $l_i$ are not dependent on the coordinate system used. Vectors
and their lengths are often denoted by the same letter but with the vector
printed boldface; for ease of printing we shall mostly not use this
convention, i.e., vectors will not be written boldface, except to
distinguish between a resultant vector $\mathbf{R}$ and its length $R$.
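
The two basic calculations, and the reduction to unit length, can be sketched as follows. The function names are hypothetical and chosen only for this illustration.

```python
import math

def scalar_product(w1, w2):
    """s(w1, w2) = sum of products of corresponding components, as in (2)."""
    return sum(a * b for a, b in zip(w1, w2))

def length(w):
    """Length l of w, from l^2 = s(w, w)."""
    return math.sqrt(scalar_product(w, w))

def to_unit(w):
    """Reduce w to unit length by dividing each component by the length."""
    l = length(w)
    return [a / l for a in w]

def angle_between(w1, w2):
    """Smaller angle between w1 and w2, from s(w1, w2) = l1 * l2 * cos(angle)."""
    c = scalar_product(w1, w2) / (length(w1) * length(w2))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding error

print(length([3.0, 4.0]), to_unit([3.0, 4.0]), angle_between([1.0, 0.0], [0.0, 2.0]))
```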

Suppose a sample of $N$ unit vectors is given, consisting of
vectors $OP_i = v_i$, $i = 1,\ldots,N$; a typical vector $v_i$ has components
$(x_{i1}, x_{i2}, \ldots, x_{ip})$, and the polar coordinates are $(\theta_{i1}, \ldots, \theta_{i(p-1)})$.
The resultant, or vector sum, of the set of $N$ vectors has components

\[ X_j = \sum_{i=1}^{N} x_{ij} , \quad j = 1,\ldots,p . \]

The resultant is denoted by $\mathbf{R} = (X_1, X_2, \ldots, X_p)$ and its length by $R$;
thus $R^2 = X_1^2 + X_2^2 + \cdots + X_p^2$. The length of the component of $\mathbf{R}$ on the
modal vector $OA$, when this is known or hypothesized, is often used
and will be denoted by $X$ with no subscript. Suppose $OA$ has unit
length, and components $a_1, a_2, \ldots, a_p$; then

\[ X = s(OA, \mathbf{R}) = a_1X_1 + a_2X_2 + \cdots + a_pX_p . \tag{3} \]

The value of $X$ will also be independent of the coordinate system.
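
As a minimal sketch of these definitions, the resultant, its length, and the component $X$ of equation (3) might be computed as below. The sample of three unit vectors and the function names are made up for illustration.

```python
import math

def resultant(vectors):
    """Components X_j = sum_i x_ij of the vector sum of the sample."""
    p = len(vectors[0])
    return [sum(v[j] for v in vectors) for j in range(p)]

def resultant_length(vectors):
    """Length R of the resultant, from R^2 = X_1^2 + ... + X_p^2."""
    return math.sqrt(sum(X * X for X in resultant(vectors)))

def component_on(vectors, a):
    """X = s(OA, R): component of the resultant on a unit vector a, as in (3)."""
    return sum(a_j * X_j for a_j, X_j in zip(a, resultant(vectors)))

# Hypothetical sample of N = 3 unit vectors in p = 3 dimensions.
sample = [(1.0, 0.0, 0.0), (0.6, 0.8, 0.0), (0.6, 0.0, 0.8)]
R = resultant_length(sample)
X = component_on(sample, (1.0, 0.0, 0.0))
print(R, X)  # X cannot exceed R, and R cannot exceed N
```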


The statistics $\mathbf{R}$, $R$ and $X$ are all important statistics for
the analysis of a sample. For example, the maximum likelihood estimator
(MLE) of the direction of the modal vector is the direction of $\mathbf{R}$, and
the MLE $\hat{k}$ of the concentration parameter $k$ is given by the
equation

\[ \frac{I_{q+1}(\hat{k})}{I_q(\hat{k})} = \frac{R}{N} , \tag{4} \]

with $q = p/2-1$ as before. For $k$ large this equation becomes

\[ \hat{k} = \frac{N(p-1)}{2(N-R)} . \tag{5} \]

If $OA$ is known, the component $X$ replaces $R$ in (4)
and (5).
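
Equation (4) has no closed-form solution in general, so a numerical sketch may help. The approach below is an assumption of this illustration rather than the author's method: the Bessel ratio is evaluated by the continued fraction that follows from the recurrence $I_{\nu}(k) - I_{\nu+2}(k) = \{2(\nu+1)/k\} I_{\nu+1}(k)$, and (4) is then solved by bisection. It requires $0 < R < N$; the names `bessel_ratio`, `k_mle` and `k_large_approx` are hypothetical.

```python
import math

def bessel_ratio(q, k, extra_terms=60):
    """I_{q+1}(k) / I_q(k), by backward evaluation of the continued fraction
    r_nu = 1 / (2(nu + 1)/k + r_{nu+1}), started from r = 0 at a high order."""
    depth = int(k) + extra_terms
    r = 0.0
    for m in range(depth, 0, -1):
        r = 1.0 / (2.0 * (q + m) / k + r)
    return r

def k_mle(R, N, p, tol=1e-10):
    """Solve equation (4) for k by bisection; needs 0 < R < N."""
    q = p / 2.0 - 1.0
    target = R / N
    lo, hi = 1e-8, 1.0
    while bessel_ratio(q, hi) < target:  # the ratio increases with k
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if bessel_ratio(q, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def k_large_approx(R, N, p):
    """Large-k approximation (5)."""
    return N * (p - 1) / (2.0 * (N - R))

# Hypothetical tightly clustered sample: N = 100, p = 3, R = 99.
print(k_mle(99.0, 100, 3), k_large_approx(99.0, 100, 3))
```

For a tightly clustered sample the two values nearly coincide, which is why approximation (5) suffices in the large-$k$ applications considered later.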


When several samples of unit vectors are given, questions
arise whether they have the same modal vectors, the same concentration
parameters, etc. Let the i-th group have modal vector $OA_i$, and
concentration parameter $k_i$. Let $v_{ij}$, $j = 1,\ldots,N_i$, be the set of unit
vectors in the i-th group, so that $N_i$ is the number of vectors in
the group, and let $R_i$ be the length of the resultant vector $\mathbf{R}_i$ of the
group. Let $N = \sum_i N_i$, and let $R$ be the length of the resultant $\mathbf{R}$ of
all the vectors treated as one large group.

2.3.  Distributions  of  statistics  X  and  R.

In the applications to be described, we shall mostly need
approximations to the distributions of $X$ and $R$ which hold when
$k$ is large; however for completeness the exact distributions of
these two statistics will be given. Suppose $J_q(t)$ and $I_q(t)$ are
respectively the usual Bessel function and the imaginary Bessel
function of order $q$, and let


and 


\[ M_1(R) = \int_0^\infty t^{q(1-N)}\, 2^{-q}\, J_q(Rt)\, \{J_q(t)\}^N\, t \, dt . \]


Also, put

\[ C_1 = \{k^q/I_q(k)\}^N \quad \text{and} \quad C_2 = \{\sqrt{\pi}\,\Gamma(q + 1/2)\}^{-1} . \]


The  densities  of  R  and  of  X  are  (Stephens,  1962a) 

f.  (R)  =  C.C_I  (k)I^/2  M,  (R)  0  5  R  S  N 

1  1  4  g  1 

\[ f_2(X) = (C_1/\pi)\, e^{kX} M_2(X) , \quad -N < X < N . \]


These densities involving Bessel functions simplify for odd values
of $p$. Stephens (1962a,b, 1967, 1969a) discusses the densities for
$p = 2$ and 3 in much greater detail and uses them to find tests for $k$.
Stephens (1962a,b) also gives the conditional density of $R$ given $X$
in p-dimensions; this is independent of $k$, as was earlier shown by
Watson for $p = 2$ and 3, and so can be used for a test that the modal
vector is along a given vector $OA_0$, when $k$ is unknown. The tests
for $p = 2$ and 3 are described in detail in Mardia (1972) and in
Section 9 of Volume 2 of Biometrika Tables for Statisticians; see
also Stephens (1975) for a test for the modal vector, when $k$ is
known. Other exact results for p-dimensions are given in
Stephens (1962a,b).

When $k = 0$, the densities above simplify considerably. The
vectors are now uniformly distributed on the hypersphere, and the length
$R$ is the length of the final displacement of $N$ unit steps in a
random walk from $O$ in p-dimensions. For this reason it attracted
attention long before the use of the von Mises and Fisher distributions.
For the exact densities, and for further references, see
Stephens (1964, 1969b). When $k = 0$, a simple approximation exists
for the distribution of $R$. Let $Z = pR^2/N$; for large $N$, $Z$ is
approximately $\chi^2$ distributed with $p$ degrees of freedom. This
result can be used to provide a test for uniformity based on $R$;
the test has come to be known as Rayleigh's test. A recent paper
(Prentice, 1978) surveys tests for uniformity in p-dimensions.
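
A minimal sketch of the Rayleigh statistic follows. The function name and the two toy samples are assumptions of this illustration; the upper percentage point of $\chi^2_p$ is left to be read from tables, since the Python standard library has no chi-square routine.

```python
def rayleigh_statistic(vectors):
    """Z = p * R^2 / N for a sample of N unit vectors in p dimensions.
    For large N and k = 0, Z is approximately chi-square with p d.f."""
    N = len(vectors)
    p = len(vectors[0])
    X = [sum(v[j] for v in vectors) for j in range(p)]  # resultant components
    R_squared = sum(c * c for c in X)
    return p * R_squared / N

# Two extreme hypothetical samples on the circle (p = 2).
balanced = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
clustered = [(0.0, 1.0)] * 4
print(rayleigh_statistic(balanced), rayleigh_statistic(clustered))
```

Uniformity is rejected when the statistic exceeds the chosen upper percentage point of $\chi^2_p$; a perfectly balanced sample gives $Z = 0$, while $N$ identical vectors give the maximum value $pN$.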


2.4.  Properties  of  the  von  Mises  density  with  large  concentration 
parameter  k  . 

When $k$ is large, there are useful approximations
concerning the densities of $R$ and of $X$. There is then a high
probability that $\theta_1$ will be small and so $\cos\theta_1 \approx 1 - \theta_1^2/2$
and $\sin\theta_1 \approx \theta_1$. The density of $\theta_1$ becomes

\[ f(\theta_1) \propto C_p(k)\exp(k)\exp(-k\theta_1^2/2)\,\theta_1^{p-2} , \quad 0 \le \theta_1 < \pi , \tag{6} \]

so that the quantity $k\theta_1^2$ has approximately a $\chi^2$ distribution
with $p-1$ degrees of freedom. Since $x_1 = \cos\theta_1$, we write

\[ k\theta_1^2 \approx 2k(1 - \cos\theta_1) = 2k(1 - x_1) \approx \chi^2_{p-1} . \tag{7} \]

Because of the symmetry around the modal vector, the other coordinates
$x_j$, $j \ge 2$, have identical distributions; for large $k$ these are approximately
normal with mean 0 and variance $1/k$. For a tightly clustered sample
of vectors $v_i$ we expect $\mathbf{R}$ to point fairly accurately along the modal
vector $OA$, and the length of $\mathbf{R}$ to be relatively large. If $OA$
is known, the projection $X$ of $\mathbf{R}$ on $OA$ will also be large. Then
clearly $N - X$ and $N - R$ are both measures of the dispersion of
the set of vectors. For large $k$, we have from (7), with $r = p-1$,
$\sum_i 2k(1 - x_{i1}) \approx \chi^2_{Nr}$; this gives $2k(N-X) \approx \chi^2_{Nr}$.
Further, the distributional results for $x_j$, $j = 2,\ldots,p$, lead
to the approximate distribution $k(R^2 - X^2)/N \approx \chi^2_r$; since $R \approx X \approx N$,
this becomes $2k(R-X) \approx \chi^2_r$.

2.5.  Tests  for  the  modal  vector  and  for  k. 

Watson (1956) and Watson and Williams (1956) have used these
identities to devise a technique of analysis for large $k$, which is
analogous to the usual one-way analysis of variance for continuous
variables. Watson writes the identity

\[ 2k(N-X) = 2k(N-R) + 2k(R-X) , \tag{8} \]

which, by analogy with the analysis of variance, becomes, in
p-dimensions

\[ \chi^2_{Nr} = \chi^2_{(N-1)r} + \chi^2_r . \tag{9} \]

This leads to the approximation for the statistic $Z_1$

\[ Z_1 = \frac{(N-1)(R-X)}{N-R} \approx F_{r,(N-1)r} , \]

where $F_{s,t}$ is the F distribution with $s$ and $t$ degrees of freedom.

Watson suggested the use of statistic $Z_1$ in two or three
dimensions, to examine whether a given vector $OA_0$ is the modal
vector. On the null hypothesis, both $R$ and $X$ will be known,
and the hypothesis will be rejected for large values of $Z_1$,
indicating that $X$ is much smaller than $R$; in that case $\mathbf{R}$ does not
point in the direction of the vector $OA_0$. Tests of the null
hypothesis that $k$ is a given value $k_0$ will be based on the
$\chi^2$ approximations given above for $2k(N-X)$, when $OA$ is known, or for
$2k(N-R)$, when $OA$ is not known. Stephens (1967, 1969a) examined these
tests for $p = 2$ and 3 and found them to be very good even for quite
low values of $k$. They will certainly be valid for the large values
of $k$ which arise in the applications below.
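
A hedged sketch of the calculation of $Z_1$ for a hypothesized modal vector follows. The function name `watson_z1` and the sample are illustrative assumptions; the hypothesized vector is assumed to have unit length, and the statistic is meaningful only for the large values of $k$ discussed above.

```python
import math

def watson_z1(vectors, a0):
    """Z_1 = (N-1)(R-X)/(N-R) with its degrees of freedom (r, (N-1)r), r = p-1.
    a0 is the hypothesized modal vector, assumed to have unit length."""
    N = len(vectors)
    p = len(a0)
    Xs = [sum(v[j] for v in vectors) for j in range(p)]  # resultant components
    R = math.sqrt(sum(c * c for c in Xs))                # resultant length
    X = sum(a * c for a, c in zip(a0, Xs))               # component on a0
    r = p - 1
    return (N - 1) * (R - X) / (N - R), r, (N - 1) * r

# Hypothetical sample of three unit vectors in p = 3 dimensions.
sample = [(1.0, 0.0, 0.0), (0.6, 0.8, 0.0), (0.6, 0.0, 0.8)]
z1, df1, df2 = watson_z1(sample, (1.0, 0.0, 0.0))
print(z1, df1, df2)  # compare z1 with the upper tail of F(df1, df2)
```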

2.6.  Comparison  of  several  modal  vectors. 

Suppose $s$ different samples of unit vectors are given and
we wish to test whether all the samples come from populations with the
same modal vector, assuming they have the same value of $k$. On the
null hypothesis we again use the $\chi^2$ approximation for $2k(N-R)$, and
apply this result to the individual samples as well as the entire group
taken as a whole. We write the following identity

\[ 2k(N-R) = 2k(N_1-R_1) + 2k(N_2-R_2) + \cdots + 2k(N_s-R_s) + 2k(R_1+R_2+\cdots+R_s-R) \tag{10} \]

and, again by analogy with the analysis of variance, we obtain

\[ 2k\Big(N - \sum_i R_i\Big) \approx \chi^2_{(N-s)r} \quad \text{and} \quad 2k(R_1+R_2+\cdots+R_s-R) \approx \chi^2_{(s-1)r} , \tag{11} \]

where again $r = p-1$; hence the statistic

\[ Z_2 = \frac{(N-s)\big(\sum_i R_i - R\big)}{(s-1)\big(N - \sum_i R_i\big)} \]

will have approximately the F distribution with $(s-1)r$ and $(N-s)r$
degrees of freedom. Therefore to test whether the different groups
have the same modal vector, the statistic $Z_2$ is calculated and
compared with this F distribution. Large values of $Z_2$ will be
significant, indicating that the $\mathbf{R}_i$ vectors point in different
directions.

The above analysis is essentially a one-way analysis of
variance which can be set up in the usual tabular form:

    Variance Components                        d.f.

    Between groups :   $\sum_i R_i - R$        $(s-1)r$

    Within groups  :   $N - \sum_i R_i$        $(N-s)r$

    Total          :   $N - R$                 $(N-1)r$

Note that throughout the table $2k$ has been omitted before the
variance components; since only ratios will be used for tests
this does not affect the calculations. This is analogous to omitting
$\sigma^2$ in the sums of squares of an ANOVA table. In a later section
we shall give an extension to the above analysis which can be used
when, for example, the groups of vectors can be classified according
to two criteria.
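
The one-way table can be sketched in code as below. The function names, the dictionary layout, and the two small groups are assumptions made for this illustration; the code presumes at least two groups and some within-group dispersion.

```python
import math

def resultant_length(vectors):
    """Length of the vector sum of a list of unit vectors."""
    p = len(vectors[0])
    return math.sqrt(sum(sum(v[j] for v in vectors) ** 2 for j in range(p)))

def one_way_table(groups):
    """Variance components (with the factor 2k omitted), d.f., and Z_2."""
    r = len(groups[0][0]) - 1          # r = p - 1
    s = len(groups)
    N = sum(len(g) for g in groups)
    sum_Ri = sum(resultant_length(g) for g in groups)
    R = resultant_length([v for g in groups for v in g])
    between, within = sum_Ri - R, N - sum_Ri
    return {
        "between": (between, (s - 1) * r),
        "within": (within, (N - s) * r),
        "total": (N - R, (N - 1) * r),
        "Z2": (N - s) * between / ((s - 1) * within),
    }

# Two hypothetical groups of unit vectors on the circle (p = 2).
groups = [[(1.0, 0.0), (0.6, 0.8)], [(0.0, 1.0), (0.8, 0.6)]]
table = one_way_table(groups)
print(table)  # Z2 is referred to F with (s-1)r and (N-s)r degrees of freedom
```

The between and within components add to the total, mirroring identity (10).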


2.7.  Tests  for  constant  k.


In the variance component analysis, $k$ is assumed to be
constant for each group, analogous to the assumption of constant
variance in an analysis of variance table. For large $k$ this
assumption can be tested as follows. Suppose in general, there are
$s$ groups of vectors, and it is required to test for constant $k$ for
all groups. For group $i$, calculate $Q_i = N_i - R_i$ and
$q_i = (N_i-1)r = (N_i-1)(p-1)$; let $T = \sum_i Q_i$ and $t = \sum_i q_i$. Calculate

\[ Z_3 = t \ln T - \sum_i q_i \ln Q_i - t \ln t + \sum_i q_i \ln q_i \]

and

\[ C = 1 + \frac{1}{3(s-1)} \Big( \sum_i \frac{1}{q_i} - \frac{1}{t} \Big) ; \]

finally let $Z_3' = Z_3/C$. The hypothesis of constant $k$ is rejected if $Z_3'$
is significantly large compared with the $\chi^2$ distribution with $s-1$
degrees of freedom. Note that if $Z_3$ is not significant, $Z_3'$ will
not be significant, since $C$ is greater than 1. This test is based
on Bartlett's test for homogeneity of variances. Other tests for
the same purpose (e.g. those in section 16 of Biometrika Tables for
Statisticians, Vol. 1) can be adapted in a similar way.
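
The calculation of $Z_3$, $C$ and the corrected statistic can be sketched as follows. The function name and the group sizes and resultant lengths are hypothetical; the code assumes every group has $N_i > 1$ and $R_i < N_i$ so that the logarithms are defined.

```python
import math

def constant_k_test(sizes, resultant_lengths, p):
    """Bartlett-type statistic Z_3, correction C, and Z_3 / C for s groups,
    to be compared with chi-square on s - 1 degrees of freedom."""
    r = p - 1
    Q = [N_i - R_i for N_i, R_i in zip(sizes, resultant_lengths)]
    q = [(N_i - 1) * r for N_i in sizes]
    s = len(sizes)
    T, t = sum(Q), sum(q)
    z3 = (t * math.log(T) - sum(qi * math.log(Qi) for qi, Qi in zip(q, Q))
          - t * math.log(t) + sum(qi * math.log(qi) for qi in q))
    C = 1.0 + (sum(1.0 / qi for qi in q) - 1.0 / t) / (3.0 * (s - 1))
    return z3, C, z3 / C

# Hypothetical groups in p = 3 dimensions with unequal dispersion.
z3, C, z3_corrected = constant_k_test([11, 21], [10.5, 19.0], 3)
print(z3, C, z3_corrected)
```

When the ratios $Q_i/q_i$ are equal across groups, $Z_3$ is zero, the analogue of equal sample variances in Bartlett's test.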


3.  THE  ANALYSIS  OF  CONTINUOUS  PROPORTIONS. 


3.1.  Applications  of  the  von  Mises  distribution  in  p  dimensions.

The von Mises distribution in p-dimensions can be a useful
tool of analysis of multivariate data, where the variables are subject
to a constraint which corresponds to the fact that all vectors in the
von Mises sample have unit lengths. An example is the analysis
of continuous proportions, where the components $x_1, x_2, \ldots, x_p$ of the
typical vector $v$ are obtained from proportions of a continuum, e.g. time
or volume or mass. Suppose, for example, a subject records the proportion of
his day spent in $p$ different activities; let these proportions be
$\pi_1, \pi_2, \ldots, \pi_p$, and let $x_i = \sqrt{\pi_i}$. Since $\sum_i x_i^2 = \sum_i \pi_i = 1$,
a typical activity pattern for a subject can be recorded as a unit
vector $v$ with components $x_1, x_2, \ldots, x_p$. A group of subjects
will be denoted by a set of points on the unit hypersphere, and the population
might well be represented by a von Mises density around a central
vector. This application occurred to the author when at a Conference some
years ago such a data set was being discussed, gathered from the activity
patterns of over 200 students at Reading University, and covering over 60
activities. The students had been asked to compile a diary for one
week, giving their activities for every fifteen minute period. Thus
the vector of activity pattern was available for each student, and in
addition a transition matrix was available giving the probability
of moving from activity $i$ to activity $j$. The example in
Section 3.3 examines a similar data set, but much simplified.
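
The square-root transformation can be sketched as follows. The function name and the hours in the example day are invented for illustration; dividing by the total is a convenience assumed here so that raw amounts (e.g. hours) can be supplied in place of proportions.

```python
import math

def proportions_to_unit_vector(amounts):
    """Convert non-negative amounts of a continuum (e.g. hours per activity)
    to proportions pi_j, then to unit-vector components x_j = sqrt(pi_j)."""
    total = sum(amounts)
    return [math.sqrt(a / total) for a in amounts]

# Hypothetical day split over p = 8 activities, in hours.
hours = [8, 3, 4, 2, 1, 2, 2, 2]
v = proportions_to_unit_vector(hours)
print(v, sum(x * x for x in v))  # the squared length is 1 up to rounding
```

All components are non-negative, so the resulting points lie in the positive orthant of the unit hypersphere.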


3.2.  The  von  Mises  distribution  for  continuous  proportions. 


The von Mises distribution was suggested above as a useful model
for a cluster of vectors around a modal direction. For continuous
proportions further analytic justification can be given. Suppose for
each $i$ the proportion $\pi_i$ recorded by a subject is a random variable
$\pi_i = a_i + \epsilon_i$; $a_i$ is the modal value for the population, and $\epsilon_i$ a
random fluctuation with a normal distribution with mean 0 and a small
variance $\sigma_i^2$. Since $\sum_i \pi_i = 1$ and $\sum_i a_i = 1$, the constraint $\sum_i \epsilon_i = 0$
is imposed on the $\epsilon_i$, which are otherwise assumed independent. The
typical component of the unit vector $v$ for the subject is $x_i = \sqrt{\pi_i}$,
and the modal vector $OA$ has components $\sqrt{a_i}$, $i = 1,\ldots,p$. The
scalar product $s(OA, v)$ gives $\cos\theta_1$; thus

\[ \cos\theta_1 = \sum_i x_i \sqrt{a_i} \approx \sum_i \{ a_i + \epsilon_i/(2\sqrt{a_i}) - \epsilon_i^2/(8 a_i^{3/2}) \} \]

using the binomial expansion of $x_i = \sqrt{a_i + \epsilon_i}$. In the simplest case,
suppose that all $a_i$ are equal to $1/p$, and that all $\sigma_i^2 = \sigma^2$, a constant.
Then $\cos\theta_1$ becomes

\[ \cos\theta_1 \approx 1 - p^{3/2} \sum_i \epsilon_i^2 / 8 ; \]

and since $\cos\theta_1 \approx 1 - \theta_1^2/2$ we have $k_1\theta_1^2 \approx \sum_i \epsilon_i^2$, where $k_1 = 4/p^{3/2}$.
However, $\sum_i \epsilon_i^2/\sigma^2$ has the $\chi^2_{p-1}$ distribution, one degree of freedom being
lost since $\sum_i \epsilon_i = 0$. Hence $k\theta_1^2 \approx \chi^2_{p-1}$, where $k = k_1/\sigma^2 = 4/(\sigma^2 p^{3/2})$.
By symmetry, the vector component of $v$ which is not along $OA$ will be
uniformly distributed; this result, together with $k\theta_1^2 \approx \chi^2_{p-1}$, indicates
that the vector $v$ has the von Mises distribution, at least to a good
approximation. When the $a_i$ and $\sigma_i^2$ are not all equal, the approximation
still holds well provided we can take $\sum_i \epsilon_i/\sqrt{a_i} \approx 0$ and
$\sum_i \epsilon_i^2/a_i^{3/2} \approx k_2 \chi^2_{p-1}$, where $k_2$ is a constant. These approximations
will be good if $\sigma_i^2 \propto a_i^{3/2}$, a reasonable model, and the von Mises
distribution will probably be quite robust for most situations provided
the $a_i$ are not too different in value.

3.3.  Example.

The data to be analyzed concern the proportions of time spent in various
activities by 130 students at Simon Fraser University. The activities
were classified in 8 ways: sleeping, attending lectures, studying,
socializing, travelling, family activities, meals, and personal activities,
and the students were asked to record their activities for one day
only. Thus the sample does not represent the overall activity pattern,
but it is used here as an illustration of the general methodology. The
complete data set is available from the author.

The first step in analyzing the data is to convert the
proportions to x-coordinates. For student $i$ let $\pi_{ij}$ be the
proportion of time spent in activity $j$, $i = 1,\ldots,N$, $j = 1,\ldots,p$,
and let $x_{ij} = \sqrt{\pi_{ij}}$. From the $x_{ij}$, the components $X_j$ of the resultant
$\mathbf{R}$, and the length $R$ of $\mathbf{R}$, are calculated as described in Section 2.2.

We first use the analysis of variance technique to examine
whether there appears to be a difference in activity patterns between
men and women. The data set is divided into two groups: group 1 for
women and group 2 for men. The results for the two groups are given
in Table 1. From the ANOVA table of part (a), the value of statistic
$Z_2$ is 0.61, which is not significant at $\alpha = .10$ when compared with
the $F_{s,t}$ distribution with $s = 7$ and $t = 896$. With such a large
value for $t$, the percentage point of $F_{s,t}$ at upper level $\alpha$ is
excellently approximated by $\chi^2_s(\alpha)/s$, where $\chi^2_s(\alpha)$ is the upper tail
percentage point of $\chi^2_s$ at level $\alpha$. Here the value required would
be $\chi^2_7(\alpha)/7$; for $\alpha = .10$, this is 1.909. Therefore there appears
to be no reason to suppose the activity pattern is different between
men and women students.

The data were next examined to see if there was a difference
in patterns explained by the style of living arrangements of the
students. The living styles were classified as: 1, college
residence; 2, marriage or marriage style; 3, other, e.g. at
home, sharing an apartment, renting a room, etc. The results are
given in part (b) of Table 1. The value of $Z_2$ is now 3.3, to be
compared with the F table with $s = 14$, $t = 889$. At the 1%
level, the value of $\chi^2_{14}(.01)/14$ is 2.08, so that $Z_2$ is significant
at this level, and we conclude that there is a difference between groups
classified by living styles. The data will be examined in greater detail
after some new techniques have been introduced.


4.  NEW  TECHNIQUES


4.1.  Two-way  analysis  of  variance. 


In  this  section  we  give  some  new  techniques  for  the  examination  of 
data.  First  the  variance  component  analysis  of  the  preceding  section  is 
extended  to  a  two-way  layout,  and  the  student  data  is  again  used  for 
illustration.  In  section  4.3,  we  discuss  goodness-of-fit  to  the  von 
Mises  distribution,  and  in  section  4.4  some  techniques  of  clustering  and 
correlation  are  briefly  mentioned. 

Suppose the sample items (for example, students) are classified in
two ways: by a main classification 1 with $I$ groups, indexed by $i = 1,\ldots,I$,
and by classification 2 with $J_i$ groups within group $i$ of
classification 1. When a student falls into group $i$ of classification 1 and
group $j$ of classification 2, the associated vector of activity proportions
will be placed in cell $(i,j)$ in row $i$, column $j$, of a two-way table.
Extending our previous notation, we write $v_{ijk}$ for the k-th vector
in cell $(i,j)$. Let $N_{ij}$ be the number of vectors in cell $(i,j)$, and
let $R_{ij}$ be the length of the resultant in this cell. Let $N$ be the
total number of vectors and let $N^*$ be the number of non-empty
cells. Let $R_{i\cdot}$ be the length of the resultant of all
vectors in row $i$, i.e., of the vectors for all students in group $i$ of
the first classification, and suppose $R_{\cdot\cdot}$ is the length of the resultant
of all the vectors. As before, write $r = p-1$. A table may be constructed as
in Table 2(a). By extension of the previous analysis, we write the
following identity


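\[ 2k(N - R_{\cdot\cdot}) = 2k \sum_{j=1}^{J_1} (N_{1j} - R_{1j}) + 2k \sum_{j=1}^{J_2} (N_{2j} - R_{2j}) + \cdots + 2k \sum_{j=1}^{J_I} (N_{Ij} - R_{Ij}) \]
\[ \qquad + 2k\Big( \sum_{j=1}^{J_1} R_{1j} - R_{1\cdot} \Big) + \cdots + 2k\Big( \sum_{j=1}^{J_I} R_{Ij} - R_{I\cdot} \Big) + 2k\Big( \sum_{i=1}^{I} R_{i\cdot} - R_{\cdot\cdot} \Big) ; \]

collecting terms, we have

\[ 2k(N - R_{\cdot\cdot}) = 2k \sum_{i=1}^{I} \sum_{j=1}^{J_i} (N_{ij} - R_{ij}) + 2k\Big( \sum_{j=1}^{J_1} R_{1j} - R_{1\cdot} \Big) + \cdots + 2k\Big( \sum_{j=1}^{J_I} R_{Ij} - R_{I\cdot} \Big) + 2k\Big( \sum_{i=1}^{I} R_{i\cdot} - R_{\cdot\cdot} \Big) \]

with corresponding distributions, for large $k$:

\[ \chi^2_{(N-1)r} = \chi^2_{(N-N^*)r} + \chi^2_{(J_1-1)r} + \cdots + \chi^2_{(J_I-1)r} + \chi^2_{(I-1)r} . \]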

The terms may be arranged in a variance component table as in
Table 2(b). A final column ("Mean Component") may be added,
giving the value of the variance component divided by its degrees of
freedom. The table allows us to examine differences between rows,
or differences between columns within any one row; thus the analysis
will be similar to what is usually called a nested analysis of variance.
To test the null hypothesis $H_0$: that there is no difference between
rows, we calculate the quotient

\[ Z_4 = \frac{(N - N^*)\big( \sum_{i=1}^{I} R_{i\cdot} - R_{\cdot\cdot} \big)}{(I-1)\big( N - \sum_{i=1}^{I} \sum_{j=1}^{J_i} R_{ij} \big)} \tag{12} \]

which, on $H_0$, has an F-distribution with $(I-1)r$ and $(N-N^*)r$
degrees of freedom. The null hypothesis is rejected for a significantly
large value of $Z_4$. Similarly, to test the null hypothesis $H_0$:
there is no difference between columns within row $i$, the quotient

\[ Z_5 = \frac{(N - N^*)\big( \sum_{j=1}^{J_i} R_{ij} - R_{i\cdot} \big)}{(J_i-1)\big( N - \sum_{i=1}^{I} \sum_{j=1}^{J_i} R_{ij} \big)} \tag{13} \]

is calculated. On $H_0$, $Z_5$ has an F-distribution with $(J_i-1)r$ and
$(N-N^*)r$ degrees of freedom, and $H_0$ should be rejected for a
significantly large value of $Z_5$.
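
The nested analysis can be sketched as follows. The data layout (a list of rows, each a list of cells, each a list of unit vectors), the function names, and the toy table are assumptions of this illustration. Empty cells are skipped, and rows with a single non-empty cell yield no $Z_5$.

```python
import math

def resultant_length(vectors):
    """Length of the vector sum of a list of unit vectors."""
    p = len(vectors[0])
    return math.sqrt(sum(sum(v[j] for v in vectors) ** 2 for j in range(p)))

def nested_analysis(table):
    """Components (factor 2k omitted), Z_4 of (12) and Z_5 of (13) per row."""
    rows = [[cell for cell in row if cell] for row in table]  # drop empty cells
    cells = [cell for row in rows for cell in row]
    N, n_star, I = sum(len(c) for c in cells), len(cells), len(rows)
    within = N - sum(resultant_length(c) for c in cells)
    row_R = [resultant_length([v for c in row for v in c]) for row in rows]
    R_all = resultant_length([v for c in cells for v in c])
    between_rows = sum(row_R) - R_all
    within_rows = [sum(resultant_length(c) for c in row) - Ri
                   for row, Ri in zip(rows, row_R)]
    z4 = (N - n_star) * between_rows / ((I - 1) * within)
    z5 = [(N - n_star) * comp / ((len(row) - 1) * within) if len(row) > 1 else None
          for row, comp in zip(rows, within_rows)]
    return {"within": within, "within_rows": within_rows,
            "between_rows": between_rows, "total": N - R_all, "Z4": z4, "Z5": z5}

# Hypothetical 2 x 2 table of unit vectors on the circle (p = 2).
table = [[[(1.0, 0.0), (0.8, 0.6)], [(0.6, 0.8), (0.8, 0.6)]],
         [[(0.0, 1.0), (0.6, 0.8)], [(1.0, 0.0), (0.6, 0.8)]]]
print(nested_analysis(table))
```

The within-cell, within-row and between-row components add to the total $N - R_{\cdot\cdot}$, as in the collected identity.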

4.2.  Example.

We continue with the example already begun in the previous
section. The original sample is now subdivided by both sex and living
arrangements, making a total of six cells. Table 3 shows the sample
size of each cell and the resultant length in each cell. Also shown
in the table are the resultant length for each row, i.e., for the
males and for the females, and for each column, i.e., for the three
styles of living arrangements. Other relevant statistics are also
given for use with Table 2(b). Tables 3(a) and 3(b) give the tables
for two analyses. The column MC gives the mean component, i.e.,
variance component divided by its degrees of freedom.

In analysis 1, we first test if there appears to be a
difference between activity patterns for men and women, using the
variance components for "between sexes" and "within groups". The
test statistic $Z_4 = 0.00857/0.01376 = 0.62$ and this is clearly
not significant when compared to the $F_{7,868}$ distribution. The next
test is for difference between living styles for women only. The
statistic $Z_5 = 0.01486/0.01376 = 1.08$, and this is also not
significant. The test statistic for difference between living styles
for men is $Z_5 = 0.0421/0.01376 = 3.06$; this is significant
compared with $F_{14,868}$ ($\approx \chi^2_{14}/14$) at the $\alpha = 0.005$ level. Thus the
previously noted difference between activity patterns for different
living styles has been narrowed down to a difference for men.

In analysis 2, the difference between activity patterns for
different living styles again shows up in the corresponding $Z_4$
statistic, $Z_4 = 0.043/0.01376 = 3.13$; but for a difference between
sexes within each living style there is no significant $Z_5$ statistic,
confirming the results already found. The results of the more detailed
two-way analysis are consistent with each other and sharpen the
conclusions gained from the one-way analysis.

Example of the test for constant k. For the cells in Table 3
the values of $\hat{k}$ across the top row are 32.44, 36.63, 38.70, and
those across the bottom row are 51.39, 36.53, 37.99. The test
statistic $Z_3$ of Section 2.7 has value $Z_3 = 3.44$, not significant,
so that the hypothesis of constant $k$ can be maintained.

Example 2. In the above example, the classification within each row
was the same, with $J_i = 3$ for both men and women. However, with a
nested model, $J_i$ can of course be different for each row. We
illustrate with a second example with data kindly provided by Dr. Charles
Jones of the Dept. of Sociology, McMaster University. $N = 232$ respondents
were each asked to rate 35 ethnic or religious groups in Canada, by the
criterion of social standing, using a 9-point scale. By a scaling
device, the 35-vector of replies was reduced to a vector in 3 dimensions.
These vectors could have negative components, in contrast to the continuous
proportions data. Thus the basic data set consists of 232 vectors in 3
dimensions; these were divided into 8 groups according to ethnic origin
of the respondents. The eight groups have been put into four rows
indicated by the nature of the Canadian population, and two of the rows
have been subclassified. The data are given in Table 4, with the Analysis
of Variance. For a test of significance between rows, the test statistic
is $Z_4 = 1.28/.063 = 20.31$ and is highly significant when compared with
$\chi^2_6/6$ at the 0.005 level. The test statistics for differences in groups
within rows 2 and 4 are respectively $Z_5 = .0306/.063$ and $Z_5 = .0419/.063$
and are clearly far from significant.

4.3.  Goodness-of -Fi t . 

The analysis described so far assumes that the observations
come from the p-dimensional von Mises distribution. In order to test
this assumption, we use two of the distributional results described
in Section 2. The results are the distribution of the angle
$\theta_i$ between a typical vector $\mathbf{v}_i$ and the modal vector OA, and
the distribution of the component of $\mathbf{v}_i$, say $\mathbf{y}_i$, at right
angles to the modal vector. Since the modal vector is not precisely known,
$\theta_i$ must be estimated by the angle between $\mathbf{v}_i$ and the
resultant $\mathbf{R}$ of the sample. The set of estimated angles is tested to
come from the density for $\theta_i$ given in equation (6), using the usual
Pearson $\chi^2$ test. Since an estimate replaces $\theta_i$, it is difficult
to determine exactly the degrees of freedom (though k-2 might be indicated
for a test involving k cells) and more examination needs to be made of the
test in this case. However the statistic will be a helpful guide to the fit
of $\theta_i$.

The components $\mathbf{y}_i$, i = 1,2,...,N, at right angles to $\mathbf{R}$
should be uniform on the hypersphere of dimension p-1, and the hypothesis
that this is so can be tested in many ways (see e.g. Prentice, 1978). For
robustness of the analysis, it will be important that they are not clustered
around a single mode and for this purpose the Rayleigh test is indicated.
The vector component $\mathbf{y}_i$ is found as follows. Let
$\mathbf{u} = \mathbf{R}/R$ be the unit vector along $\mathbf{R}$, where $R$ is
the length of $\mathbf{R}$, and let $c_i$ be the scalar product of
$\mathbf{v}_i$ and $\mathbf{u}$, that is, the cosine of the angle between
$\mathbf{v}_i$ and $\mathbf{R}$ (see Section 2.2 for these calculations). The
component of $\mathbf{v}_i$ along $\mathbf{R}$ is then $c_i \mathbf{u}$, and
$\mathbf{y}_i$, the component at right angles, is
$\mathbf{y}_i = \mathbf{v}_i - c_i \mathbf{u}$. To apply the Rayleigh test, we
then reduce each vector $\mathbf{y}_i$ to unit length as shown in Section 2.2;
let the unit vector be $\mathbf{z}_i$ with components
$z_{i1}, z_{i2}, \ldots, z_{ip}$; the resultant $\mathbf{Z}$ of this set of
vectors has components $Z_1, Z_2, \ldots, Z_p$, where $Z_j = \sum_i z_{ij}$, and
the length $Z$ of $\mathbf{Z}$ is given by
$Z^2 = Z_1^2 + Z_2^2 + \cdots + Z_p^2$. On the null hypothesis that the vectors
$\mathbf{z}_i$ are uniform in the p-1 dimensional subspace, the test statistic
$T = (p-1)Z^2/N$ is asymptotically distributed as $\chi^2$ with p-1 degrees of
freedom; the hypothesis of uniformity is rejected if T is larger than the
appropriate upper percentage point of this distribution. The two tests
above, taken together, provide a good omnibus test that the original
sample of vectors $\mathbf{v}_i$ comes from the von Mises distribution.

The distributional tests are applied to each cell of the two way analysis
of variance table described above, analogous to applying tests for
normality in the usual analysis of variance table. If, of course,
there were to be a significant difference in the modal vector, say
between men and women, a test for the von Mises distribution applied
to the complete sample, including both men and women, might well be
rejected. On the other hand, if each group is found to have a von
Mises distribution, the test for common modal vector can be applied,
and if accepted, the overall sample should have a von Mises
distribution. For the student data in Table 3, all the groups gave
far from significant values for the test statistics for goodness-of-fit,
so that the von Mises distribution appears to fit the data well.
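
The second of the two tests, the Rayleigh test on the components at right
angles to the resultant, can be sketched in a few lines of Python. This is
an illustration under stated assumptions rather than the author's program:
the function names are invented here, the input is assumed to be a list of
unit vectors, and a vector lying exactly along the resultant (which has no
perpendicular direction) is simply skipped, a choice the text does not
discuss. The first test is not sketched because it needs the density of
equation (6).

```python
import math


def dot(a, b):
    """Scalar product of two vectors given as sequences of numbers."""
    return sum(x * y for x, y in zip(a, b))


def perpendicular_rayleigh(vectors):
    """Rayleigh statistic T = (p-1) Z^2 / N for the components of the
    vectors at right angles to their resultant.

    Returns (T, degrees of freedom), where T is referred to chi-square
    on p-1 degrees of freedom.
    """
    p = len(vectors[0])
    resultant = [sum(v[k] for v in vectors) for k in range(p)]
    length_R = math.sqrt(dot(resultant, resultant))
    u = [x / length_R for x in resultant]        # unit vector along R

    Z = [0.0] * p                                # resultant of the z_i
    n_used = 0
    for v in vectors:
        c = dot(v, u)                            # component of v along R
        y = [vk - c * uk for vk, uk in zip(v, u)]
        length_y = math.sqrt(dot(y, y))
        if length_y == 0.0:
            continue                             # assumption: skip such vectors
        n_used += 1
        for k in range(p):
            Z[k] += y[k] / length_y              # add the unit vector z_i
    if n_used == 0:
        raise ValueError("no vector has a component at right angles to R")
    T = (p - 1) * dot(Z, Z) / n_used
    return T, p - 1


# Made-up unit vectors in 3 dimensions, spread evenly around the third axis.
sample = [[0.6, 0.0, 0.8], [-0.6, 0.0, 0.8], [0.0, 0.6, 0.8], [0.0, -0.6, 0.8]]
T, df = perpendicular_rayleigh(sample)
print("T =", T, "on", df, "degrees of freedom")
```

For the evenly spread sample the perpendicular components cancel and T is
essentially zero; a sample whose perpendicular components cluster on one
side gives a larger T.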

4.4.  Clustering  and  correlation. 

The scalar product $s_{ij} = s(\mathbf{v}_i, \mathbf{v}_j)$ is a convenient
measure of the closeness of the vectors $\mathbf{v}_i$ and $\mathbf{v}_j$;
$s_{ij}$ takes values between -1 and 1. We can call $s_{ij}$ a proximity
measure, and the matrix S, with entries $s_{ij}$, a proximity matrix. A cluster
may then be defined as containing all points for which $s_{ij}$ is
greater than $r_0$, for a suitable $r_0$, or by using some similar
algorithm. This proximity measure has been used on some economic
data in an M.Sc. thesis (Holguin, 1980); data on the proportions
of different wood products produced by Canadian and U.S. lumber
companies were provided by Dr. R. Schwindt of this University and the
companies were examined both for differences between groups and also
to find clusters. In a second data set, taken from a U.N. publication,
countries were clustered according to the proportions of certain staple
foods in the national diet.
When  two  sets  of  vectors  can  be  logically  paired,  the  techniques
of  correlation  developed  by  Stephens  (1979)  may  be  useful.  These  have 
not  been  illustrated  here  because  they  did  not  appear  to  be  applicable 
to  the  student  data. 
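
The proximity matrix and the threshold rule described in the first paragraph
of this section can be sketched as follows. The sketch is hedged in two
ways. First, the text leaves the clustering rule open ("or by using some
similar algorithm"); the version below joins two points whenever their
scalar product exceeds $r_0$ and takes connected sets of points as clusters,
which is one possible reading and not necessarily the one used in the
thesis cited. Second, the function names and the small data set are invented
for illustration.

```python
import math


def unit(v):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def proximity_matrix(vectors):
    """Matrix S with entries s_ij, the scalar product of unit vectors i and j."""
    units = [unit(v) for v in vectors]
    return [[sum(a * b for a, b in zip(vi, vj)) for vj in units] for vi in units]


def threshold_clusters(S, r0):
    """Group indices so that i and j share a cluster when they are linked,
    directly or through other points, by scalar products greater than r0."""
    n = len(S)
    assigned = [False] * n
    clusters = []
    for start in range(n):
        if assigned[start]:
            continue
        assigned[start] = True
        members, stack = [], [start]
        while stack:
            i = stack.pop()
            members.append(i)
            for j in range(n):
                if not assigned[j] and S[i][j] > r0:
                    assigned[j] = True
                    stack.append(j)
        clusters.append(sorted(members))
    return clusters


# Made-up vectors: the first two and the last two point in similar directions.
data = [[0.70, 0.20, 0.10], [0.65, 0.25, 0.10],
        [0.10, 0.20, 0.70], [0.15, 0.15, 0.70]]
S = proximity_matrix(data)
print(threshold_clusters(S, 0.9))
```

Raising $r_0$ towards 1 splits the data into more and smaller clusters, and
lowering it merges them, so in practice several values would be tried.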


5.  CONCLUDING  REMARKS 

(a)  It  is  hoped  that  the  methodology  developed  above  will  be  useful 
for  the  analysis  of  data  which  it  is  convenient  to  record  as  a  set  of 
unit  vectors;  the  example  illustrated  is  that  of  continuous  proportions. 
The special feature of the technique is that it incorporates the constraint
that the proportions sum to 1 in a natural way. Note that the order of
proportions, e.g. the order of labelling student activities, does
not affect the analysis. For reasons of space, only one such
example  has  been  discussed  in  detail  but  analyses  similar  to  the  above 
could  be  applied  to  the  proportions  of  different  minerals  in  an  ore 
deposit,  calculated  by  volume  or  by  mass,  or  the  proportions  of 
different  products  in  the  total  output  of  a  company,  the  proportions 
of  the  area  of  a  city  used  for  different  purposes,  etc.  Several 
examples  on  these  lines  have  been  suggested  to  the  author,  and  they 
will  be  followed  up  in  later  case  studies. 
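
The remark that the order of the proportions does not matter can be checked
numerically: relabelling the categories applies the same permutation to the
components of every vector, which leaves all scalar products, and hence all
resultant lengths, unchanged. The short Python sketch below assumes, purely
for illustration, that each set of proportions is turned into a unit vector
by taking the square roots of its components; the data and the function
names are invented.

```python
import math


def to_unit_vector(proportions):
    """Assumed mapping: square roots of proportions summing to 1 give a
    vector whose squared components sum to 1."""
    return [math.sqrt(q) for q in proportions]


def resultant_length(vectors):
    """Length of the vector sum of a set of vectors."""
    p = len(vectors[0])
    return math.sqrt(sum(sum(v[k] for v in vectors) ** 2 for k in range(p)))


# Made-up proportions for three individuals over three categories.
proportions = [[0.5, 0.3, 0.2], [0.4, 0.4, 0.2], [0.6, 0.1, 0.3]]

# Relabel the categories: the same new order is applied to every individual.
order = [2, 0, 1]
relabelled = [[row[k] for k in order] for row in proportions]

R_original = resultant_length([to_unit_vector(row) for row in proportions])
R_relabelled = resultant_length([to_unit_vector(row) for row in relabelled])
print(R_original, R_relabelled)
```

The two printed lengths agree up to rounding error, so any analysis built
only on resultant lengths is unaffected by the relabelling.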

(b) For proportions analysis, the components of sample vectors
are naturally all positive, so that the vectors in p-dimensions are
tightly clustered; hence we obtain the high $\kappa$ values seen with the
student data, and the analysis of variance technique works very well.
Nevertheless, the robustness of these methods needs further exploration.
For example, the effect must be determined of specifying too many
components for $\mathbf{v}_i$, e.g. too many student activities, or too many
subdivisions of output of a lumber company, so that some components
are zero for many of the sample vectors. The Rayleigh test should
detect if the effective reduction in dimensions which this produces
has a strong influence.

(c)  The  methodology  described  is  not  primarily  intended  for  proportions 
which  come  from  counted  data,  such  as  proportions  of  a  sample  of  voters 
expressing  different  political  preferences.  Proportions  of  this  type 
are  usually  examined  for  homogeneity  in  contingency  tables  (though 
not  of  course  using  the  proportions  themselves) ,  and  each  group  of 
voters  represents  an  independent  sample.  Even  if  they  were  expressed 
as  unit  vectors,  the  model  behind  the  counted  data  is  such  that  it  would 
not  necessarily  be  appropriate  to  regard  the  vectors  as  from  a 
von  Mises  distribution.  However,  if  the  samples  of  voters  could  be 
naturally  grouped,  say  by  regions,  and  especially  if  each  voter  sample 
were of the same size, some of the above techniques might be useful
in exploring the data. For example the scalar product proximity
measure $s_{ij}$ discussed in Section 4.4 could be a practical measure
of the similarity of two patterns of voter preferences.

The pitfalls of proportions, which have often been emphasized in
discussions of contingency tables, should again be stressed. The
author has seen, for example, a proposal to analyze proportions of
land use in major cities, one of the examples briefly mentioned above,
using contingency tables, on the grounds that the numbers were "counted
data"; they had been obtained by superimposing a fine grid on the map
and counting squares. It is not always easy, especially for the
applied worker with only a limited knowledge of a contingency table
model, to distinguish between the two types of data when presented
as proportions.




(d)  It  may  be  seen  that  there  is  no  shortage  of  interesting  examples 
where  the  above  methodology  may  be  applied.  The  best  test  of  its 
effectiveness  will  be  in  these  practical  applications  and  especially 
in  comparisons  with  other  techniques  of  analysis.  In  this  way  any 
difficulties  of  application,  especially  concerning  robustness  of  the 
methods,  will  hopefully  come  to  light.  A  number  of  such  comparisons 
have  been  started,  and  it  is  hoped  in  a  later  paper  to  report  on  several 
case  studies  in  different  fields. 


TABLE 1

Analysis of activity pattern of 130 students.
Overall resultant length (130 vectors) R = 117.199.

Part (a) Analysis of differences between men and women.

Group          $N_i$     $R_i$       $\hat{\kappa}_i$
1 (Women)       56       50.504      35.67
2 (Men)         74       66.754      37.74
Total          130      117.258

ANOVA Table.

Variance Component                          Value     d.f.   Mean Component
Between groups     $\sum_i R_i - R$          0.059       7    .0086
Within groups      $N - \sum_i R_i$         12.742     896    .0142
Total                                       12.801     903

Part (b) Analysis of differences by living style.

Group          $N_i$     $R_i$       $\hat{\kappa}_i$
1               18       16.317      37.43
2               28       25.190      34.87
3               84       76.293      38.13
Total          130      117.801

ANOVA Table.

Variance Component                          Value     d.f.   Mean Component
Between groups     $\sum_i R_i - R$          0.602      14    .043
Within groups      $N - \sum_i R_i$         12.199     889    .0137
Total              $N - R$                  12.801     903
TABLE 2

Two way analysis of variance for resultant vectors.

(a) Table of resultants.

                         Classification 2 (Columns)                   Vector
Classification 1
(Rows)             1          2          3          4          ...    Total
1                  $R_{11}$   $R_{12}$   $R_{13}$   $R_{14}$   ...    $R_{1\cdot}$
2                  $R_{21}$   $R_{22}$   $R_{23}$              ...    $R_{2\cdot}$
.
.
I                  $R_{I1}$   ...                                     $R_{I\cdot}$
                                                                      $R$

Note that the vector total at the end of a row or column is not the
arithmetic total of the entries of that row or column.

(b) ANOVA Table.

Variance Component               Value                                        d.f.
Between rows                     $\sum_{i=1}^{I} R_{i\cdot} - R$              $(I-1)r$
Between cols, within row 1       $\sum_{j=1}^{J_1} R_{1j} - R_{1\cdot}$       $(J_1-1)r$
.
.
Between cols, within row I       $\sum_{j=1}^{J_I} R_{Ij} - R_{I\cdot}$       $(J_I-1)r$
Within groups                    $N - \sum_{i=1}^{I} \sum_{j=1}^{J_i} R_{ij}$ $(N-N^*)r$
Total                            $N - R$                                      $(N-1)r$
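
The scheme of Table 2 can be turned into a small routine that builds the
analysis of variance from resultant lengths alone. The Python sketch below
is an illustration and not the author's program: the function name and
argument layout are invented, $N^*$ is taken to be the total number of cells
(which reproduces the 868 degrees of freedom of Table 3), and r is passed in
by the caller (r = 7 reproduces the degrees of freedom of Table 3). The
usage example feeds in the resultant lengths printed in Table 3, so small
differences in the third decimal place from the printed components are due
to rounding of those entries.

```python
def resultant_anova(cell_R, row_R, total_R, N, r):
    """Two way analysis of variance components from resultant lengths.

    cell_R  : list of rows, each a list of cell resultant lengths R_ij
    row_R   : list of row resultant lengths R_i.
    total_R : resultant length R of all N vectors
    r       : degrees of freedom carried by each comparison
    Returns a list of (label, value, degrees of freedom).
    """
    n_cells = sum(len(row) for row in cell_R)     # assumed meaning of N*
    table = [("Between rows", sum(row_R) - total_R, (len(row_R) - 1) * r)]
    for i, row in enumerate(cell_R):
        if len(row) > 1:                          # rows with one cell add nothing
            table.append(("Between cols, within row %d" % (i + 1),
                          sum(row) - row_R[i], (len(row) - 1) * r))
    sum_cells = sum(sum(row) for row in cell_R)
    table.append(("Within groups", N - sum_cells, (N - n_cells) * r))
    table.append(("Total", N - total_R, (N - 1) * r))
    return table


# Resultant lengths from Table 3: rows are sexes (F, M), columns living styles.
table3 = resultant_anova(
    cell_R=[[8.029, 11.758, 30.925], [8.387, 13.563, 45.394]],
    row_R=[50.504, 66.754],
    total_R=117.198,
    N=130,
    r=7,
)
for label, value, df in table3:
    print("%-28s %8.3f %5d" % (label, value, df))
```

Because each component is a difference of sums of resultant lengths, the
components above the Total line add up to the Total, and so do their
degrees of freedom.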


TABLE 3

Results for 130 students classified by sex and by three living styles.

(a) Each cell shows the number of students and the resultant length
for the group. $R_{i\cdot}$ = length of resultant of all students in row i,
and $R_{\cdot j}$ is the resultant length for all students in
column j. $R_{\cdot\cdot}$ is the length of the resultant of all 130 vectors.

                  Living style 1     2              3              $R_{i\cdot}$
Sex F             9    8.029         13   11.758    34   30.925     50.504
    M             9    8.387         15   13.563    50   45.394     66.754
$R_{\cdot j}$          16.317             25.190         76.293

$\sum_j R_{1j} = 50.712$     $\sum_i R_{i1} = 16.416$     $\sum_j R_{\cdot j} = 117.800$
$\sum_j R_{2j} = 67.344$     $\sum_i R_{i2} = 25.321$     $\sum_i R_{i\cdot} = 117.258$
$\sum_{ij} R_{ij} = 118.058$   $\sum_i R_{i3} = 76.319$     $R_{\cdot\cdot} = 117.198$

(b) ANOVA Table for analysis 1.

Variance Component                                      Value    d.f.   M.C.    Test Statistic
Between sexes        $\sum_i R_{i\cdot} - R_{\cdot\cdot}$     0.060      7   .0086    .62
Between styles, F    $\sum_j R_{1j} - R_{1\cdot}$             0.208     14   .0149   1.08
Between styles, M    $\sum_j R_{2j} - R_{2\cdot}$             0.590     14   .0421   3.06
Within groups        $N - \sum_{ij} R_{ij}$                  11.942    868   .0138

(c) ANOVA Table for analysis 2.

Variance Component                                           Value    d.f.   M.C.    Test Statistic
Between styles            $\sum_j R_{\cdot j} - R_{\cdot\cdot}$    0.602     14   .043    3.13
Between sexes, Style 1    $\sum_i R_{i1} - R_{\cdot 1}$            0.099      7   .014    1.02
Between sexes, Style 2    $\sum_i R_{i2} - R_{\cdot 2}$            0.131      7   .019    1.36
Between sexes, Style 3    $\sum_i R_{i3} - R_{\cdot 3}$            0.026      7   .0037    .270
Within groups             $N - \sum_{ij} R_{ij}$                  11.942    868   .0138

TABLE 4

Vectors determined from sociological ratings.

Each cell shows the number of respondents and the resultant length
for the group.

Respondents    Group       Number    Resultant    Row totals
Canadian       Canadian      33        28.59
British        English       62        55.19      $N_{2\cdot} = 103$, $R_{2\cdot} = 91.42$
               Irish         15        12.86
               Scots         26        23.49
French         French        41        32.75
Others         German         9         8.25      $N_{4\cdot} = 55$, $R_{4\cdot} = 50.72$
               Russian        5         4.58
               Others        41        38.06

ANOVA table

Component        Value                        df                    M.C.    Statistic
Between rows     203.48 - 195.79 = 7.69       6                     1.28    20.31
Within row 1     -
       row 2     91.54 - 91.42 = .12          $r(J_2-1) = 4$        .03       .49
       row 3     -
       row 4     50.89 - 50.72 = .17          $r(J_4-1) = 4$        .04       .67
Within groups    232 - 203.77 = 28.23         2(232-8) = 448        .063